Notes (Week 3 Wednesday)
These are the notes I wrote during the lecture.
This week is *recursion* and *induction* (you can use these words mostly interchangeably).

- Recursive function definitions

    f(0) = 1                      (B)
    f(n) = n + f(n-1)  if n > 0   (R)

  They're called *recursive* because the symbol we're defining (here f) may occur on
  the RHS of the defining equations. *But* the occurrences on the RHS must be for
  *smaller values* than on the LHS. Otherwise, the recursion doesn't bottom out.
  In math, recursion that doesn't bottom out may lead to incomplete or unsound
  definitions (don't do it!). In programming, you'll run forever.

- Inductive definitions of relations

  Let ℕ be inductively defined by the rules:

    (B) 0 ∈ ℕ
    (I) If n ∈ ℕ then n+1 ∈ ℕ

  We take ℕ as the *smallest* set for which the rules (B) and (I) are valid. Many
  sets contain 0 and the successor of everything in them, e.g. ℝ would also satisfy
  rules (B) and (I).

  The reason such inductive definitions work is that there's always a unique least
  set that satisfies such a set of rules. Namely: the intersection of all the sets
  that satisfy the rules. This is the same thing as: the set of everything you can
  build by starting from nothing and using rules (B) and (I) finitely many times.

This lecture: induction. We started on Monday with natural number induction.

Suppose I have a proof using induction from m upwards:

  ∀n ≥ m. P(n)   by proving   P(m)   and   ∀k ≥ m. P(k) ⇒ P(k+1)

I could have done exactly the same thing using basic induction to prove

  ∀n ∈ ℕ. P'(n)   where   P'(n) ≡ P(n+m)

Basic induction then requires me to prove:

  (B) P'(0), which is P(m), which I already proved above
  (I) ∀k. P'(k) ⇒ P'(k+1)
      ...which is ∀k. P(k+m) ⇒ P(k+m+1)
      ...which is equivalent to ∀k ≥ m. P(k) ⇒ P(k+1)
      ...which I already proved above

Q: How to say that P(x) is true in mathematical symbols?
A: ⊧ P(x) (coming in the near future)

Suppose we have a proof of P(n) for every l-th n ≥ m, using induction with "big
steps" as on slide 49. Then we *could* replay the proof by proving the following
using basic induction:

  ∀n ∈ ℕ. P'(n)   where   P'(n) ≡ P(n*l + m)

Example (big steps of size 4): proving 3 | F(k+4) from the IH 3 | F(k):

  F(k+4) = F(k+3) + F(k+2)
         = 2*F(k+2) + F(k+1)
         = 2*(F(k+1) + F(k)) + F(k+1)
         = 3*F(k+1) + 2*F(k)

We need that 3 | 3*F(k+1) + 2*F(k). Follows from the IH that 3 | F(k).

We want to prove ∀n ∈ ℕ. P(n). Strong induction then requires us to prove:

  (I) ∀k ∈ ℕ. (∀m ∈ ℕ. m < k ⇒ P(m)) ⇒ P(k)
      ^ We need to prove P(k) from the assumption that P holds for every number
        smaller than k.

Q: Why is there no base case here? Isn't that spooky?
A: Actually, there is one! It's (I), but when k is 0:

     (∀m ∈ ℕ. m < 0 ⇒ P(m)) ⇒ P(0)
   ≡ (∀m ∈ ℕ. ⊥ ⇒ P(m)) ⇒ P(0)     ⊥ means "false"
   ≡ (∀m ∈ ℕ. ⊤) ⇒ P(0)            ⊤ means "true"
   ≡ ⊤ ⇒ P(0)
   ≡ P(0)

   It's not that the base case isn't in there. It's just that we don't need to
   treat it separately.

   ⊥ is pronounced "bottom"
   ⊤ is pronounced "top"

What we've called "basic induction" is a special case of structural induction.
Specifically: basic induction is structural induction over ℕ.

A relation ≺ is
  *reflexive*       if ∀x. x ≺ x
  *anti-reflexive*  if ∀x. ¬(x ≺ x)
  *non-reflexive*   if ∃x. ¬(x ≺ x)

(ℕ,<) is a strict poset. It's also well-founded: choose your favourite number
(say 6);

  0 < ... < 3 < 4 < 6
  ^ there's no way we can make a chain that descends forever along <

(ℤ,<) is not a wfo, because we can make infinitely descending chains:

  ... < -126216 < -1 < 0 < 6

(ℝ+ ∪ {0},<) is *not* well-founded, despite being a poset, and despite having a
bottom element:

  ... < 1/4 < 1/2 < 1

Well-founded relations are closely related to termination. (They're in fact the
same thing!)
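Before moving on: a minimal Haskell sketch of the recursive definition of f from the
top of these notes. Pattern matching mirrors the (B)/(R) clauses exactly; the second
function, g, is my own illustration of recursion that doesn't bottom out.

    -- The recursive definition from the start of these notes:
    -- (B) and (R) become the two pattern-matching clauses.
    f :: Integer -> Integer
    f 0 = 1                       -- (B)
    f n | n > 0 = n + f (n - 1)   -- (R): the recursive call is on a *smaller* value

    -- Recursion that doesn't bottom out: the call on the RHS is on a
    -- *larger* value, so evaluating g on any input runs forever.
    g :: Integer -> Integer
    g n = g (n + 1)

    main :: IO ()
    main = print (f 5)   -- 16, i.e. 5+4+3+2+1 plus the base case's 1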
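And a sanity check of the F(k+4) computation above, assuming the usual convention
F(0) = 0, F(1) = 1 (the function name fib is mine):

    -- Fibonacci, directly from the recursive definition.
    fib :: Integer -> Integer
    fib 0 = 0
    fib 1 = 1
    fib n = fib (n - 1) + fib (n - 2)

    -- Check the identity F(k+4) = 3*F(k+1) + 2*F(k), and that 3 | F(4n).
    main :: IO ()
    main = do
      print (and [ fib (k + 4) == 3 * fib (k + 1) + 2 * fib k | k <- [0 .. 15] ])
      print [ fib (4 * n) `mod` 3 | n <- [0 .. 8] ]   -- all zeros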
Suppose we have a poset (S, ≺) where S is the *state space* of a computer program
(the set of all states that could occur during the execution of the program).

  s' ≺ s means "starting from the state s, it's possible to reach the state s' by
  executing the program a little bit"

Suppose we had an infinitely descending chain along ≺.
Q: What does that say about the program?
A: The program might not terminate!

Suppose we know that (S, ≺) is a well-founded order.
Q: What does that say about the program?
A: It always terminates. Every chain in S (aka every possible execution sequence)
   bottoms out in a state where we can't keep computing. That means we *must*
   terminate! Hence: the minimal elements of the poset are the final states
   (halting states).

In other words: the class of programs we can build such a wfo on is the class of
programs that's guaranteed to terminate.

Well-founded induction.

Because a strict relation is anti-reflexive, we can't have programs that revisit
the exact same state ordered by such a relation. Hence: not all programs would be
captured by the above. But there are two ways around this:
- we don't care, because we were specifically trying to describe the terminating
  programs
- we could chuck a counter onto our state that increments with every step

Theorem (Emmy Noether? Or folk?). The following three statements are equivalent:
1. (S,≺) is a strict poset with no infinite descending chains
2. (S,≺) is a strict poset where every non-empty subset of S has a minimal element
3. The following induction principle (well-founded induction, or Noetherian
   induction) is valid:

     To prove                ∀s ∈ S. P(s)
     it suffices to prove    ∀s ∈ S. (∀s' ∈ S. s' ≺ s ⇒ P(s')) ⇒ P(s)

In other words: (1) "it terminates" is equivalent to (3) "we can use induction to
reason about it".

Complete induction on ℕ (what we called strong induction above) is the same thing
as well-founded induction on (ℕ,<).

We could use this to derive another induction principle for ℤ: take (ℤ,≺) where

  x ≺ y ≡ |x| < |y|      |x| means "the absolute value of x"

...this is a WFO (follows from the fact that (ℕ,<) is one). Therefore, well-founded
induction on this order gives us an induction principle: "induction on the distance
from 0".

In programming: if v = "he" and w = "llo", then v + w = "hello".

In formal languages: v = he, w = llo, and vw = hello, or sometimes v⬝w = hello.

    (he)⬝(llo)
  = h((e)⬝llo)       (concat.I)
  = h(e((λ)⬝llo))    (concat.I)
  = h(e(llo))        (concat.B)
  = hello            (by associativity)

Q: Do the definitions have to correspond to the inductive definition of strings?
A: You'll notice that we used exactly one recursive equation per clause of the
   definition of the set of strings. We don't have to do this, but it makes life
   easier for two reasons: (1) a recursive function of this shape must terminate,
   and (2) it's easier to do proofs about it when our functions and our induction
   rules have the same shape.

reverse(hello) = olleh, using reverse(v⬝w) = reverse(w)⬝reverse(v):

  reverse(hello) = reverse(llo)⬝reverse(he) = oll⬝eh = olleh

and, clause by clause:

    reverse(he)
  = reverse(e)⬝h
  = reverse(λ)⬝e⬝h
  = λ⬝e⬝h
  = eh
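A Haskell sketch of these string functions on an explicit inductive string type.
The names Str, Nil, Cons and the conversion helpers are mine; Nil plays the role of
λ, and the clauses of cat and rev are read off the (concat.B)/(concat.I) equations
and the reverse steps above.

    -- A string is either empty (λ) or a character followed by a string.
    data Str = Nil | Cons Char Str

    -- (concat.B)  λ ⬝ w     = w
    -- (concat.I)  (a v) ⬝ w = a (v ⬝ w)
    cat :: Str -> Str -> Str
    cat Nil        w = w
    cat (Cons a v) w = Cons a (cat v w)

    -- reverse λ = λ ;  reverse (a v) = reverse v ⬝ (a λ)
    rev :: Str -> Str
    rev Nil        = Nil
    rev (Cons a v) = cat (rev v) (Cons a Nil)

    -- Conversions to/from ordinary Haskell strings, just for testing.
    fromList :: String -> Str
    fromList = foldr Cons Nil

    toList :: Str -> String
    toList Nil        = ""
    toList (Cons a v) = a : toList v

    main :: IO ()
    main = putStrLn (toList (rev (fromList "hello")))   -- prints "olleh"

Note that cat and rev have exactly one equation per constructor of Str, which is the
shape the Q/A above recommends: such definitions terminate, and a proof by structural
induction on Str has the same two cases as the function.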
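Going back to the state-space picture at the start of this section: a toy example
(entirely my own) of a program whose step relation strictly decreases a measure into
(ℕ,<), which is exactly the well-foundedness argument for its termination.

    -- The state is (n, acc); each step makes n strictly smaller, so the states
    -- of any run form a descending chain along n in (ℕ,<). Since (ℕ,<) is
    -- well-founded, every run bottoms out in a halting state.
    step :: (Integer, Integer) -> Maybe (Integer, Integer)
    step (0, _)             = Nothing                 -- minimal element: a halting state
    step (n, acc) | n > 0   = Just (n - 1, acc + n)   -- one small execution step

    run :: (Integer, Integer) -> (Integer, Integer)
    run s = maybe s run (step s)

    main :: IO ()
    main = print (run (10, 0))   -- (0,55): terminates, summing 10+9+...+1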